MM-WLAuslan: Multi-View Multi-Modal Word-Level Australian Sign Language Recognition Dataset

Neural Information Processing Systems

Given the diversity of sign languages across geographical regions, developing region-specific datasets for isolated sign language recognition (ISLR) is crucial for supporting communication and research. Auslan, the sign language of the Australian Deaf community, still lacks a dedicated large-scale word-level dataset for the ISLR task.







GPS Denied IBVS-Based Navigation and Collision Avoidance of UAV Using a Low-Cost RGB Camera

Wang, Xiaoyu, Tan, Yan Rui, Leong, William, Huang, Sunan, Teo, Rodney, Xiang, Cheng

arXiv.org Artificial Intelligence

Abstract -- This paper proposes an image-based visual servoing (IBVS) framework for UAV navigation and collision avoidance using only an RGB camera. While UAV navigation has been extensively studied, it remains challenging to apply IBVS in missions involving multiple visual targets and collision avoidance. The proposed method achieves navigation without explicit path planning, and collision avoidance is realized through AI-based monocular depth estimation from RGB images. Unlike approaches that rely on stereo cameras or external workstations, our framework runs fully onboard a Jetson platform, ensuring a self-contained and deployable system. Experimental results validate that the UAV can navigate across multiple AprilTags and avoid obstacles effectively in GPS-denied environments.

I. INTRODUCTION. Most UAV applications depend on position estimation provided by global positioning systems (GPS). However, GPS is often unavailable in indoor, mountainous, or forest environments, motivating the use of computer vision for UAV navigation. This paper focuses on image-based visual servoing (IBVS) with an onboard RGB camera.
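The abstract does not give the paper's control law, but classic point-feature IBVS regulates an image-space error e = s - s* with the camera twist v = -λ L⁺ e, where L is the interaction (image Jacobian) matrix built from feature coordinates and depth. A minimal sketch of that textbook law (not this paper's exact implementation; the feature layout and gain are illustrative):

```python
import numpy as np

def interaction_matrix(x, y, Z):
    """Interaction matrix for one normalized image point (x, y) at
    depth Z, relating its image velocity to the camera's 6-DoF twist
    (vx, vy, vz, wx, wy, wz)."""
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x**2), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y**2, -x * y, -x],
    ])

def ibvs_twist(points, targets, depths, gain=0.5):
    """Classic IBVS law: v = -gain * pinv(L) @ (s - s*)."""
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(points, depths)])
    e = (np.asarray(points) - np.asarray(targets)).ravel()
    return -gain * np.linalg.pinv(L) @ e

# Example: four tag-corner features all offset 0.1 to the right of
# their targets; the commanded twist translates the camera along +x
# so the features drift left toward the goal configuration.
pts = [(0.1, 0.1), (0.1, -0.1), (-0.1, 0.1), (-0.1, -0.1)]
tgt = [(0.0, 0.1), (0.0, -0.1), (-0.2, 0.1), (-0.2, -0.1)]
v = ibvs_twist(pts, tgt, depths=[2.0] * 4)
```

In the paper's setting, the per-feature depth Z would come from the AI-based monocular depth estimate rather than a stereo pair.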


SD-VLM: Spatial Measuring and Understanding with Depth-Encoded Vision-Language Models

Chen, Pingyi, Lou, Yujing, Cao, Shen, Guo, Jinhui, Fan, Lubin, Wu, Yue, Yang, Lin, Ma, Lizhuang, Ye, Jieping

arXiv.org Artificial Intelligence

While vision language models (VLMs) excel in 2D semantic visual understanding, their ability to quantitatively reason about 3D spatial relationships remains under-explored, largely because 2D images convey limited spatial information. In this paper, we analyze the problem hindering VLMs' spatial understanding abilities and propose SD-VLM, a novel framework that significantly enhances the fundamental spatial perception abilities of VLMs through two key contributions: (1) we propose the Massive Spatial Measuring and Understanding (MSMU) dataset with precise spatial annotations, and (2) we introduce a simple depth positional encoding method that strengthens VLMs' spatial awareness. The MSMU dataset covers massive quantitative spatial tasks with 700K QA pairs, 2.5M physical numerical annotations, and 10K chain-of-thought augmented samples. We have trained SD-VLM, a strong generalist VLM which shows superior quantitative spatial measuring and understanding capability. SD-VLM not only achieves state-of-the-art performance on our proposed MSMU-Bench, but also shows spatial generalization abilities on other spatial understanding benchmarks including Q-Spatial and SpatialRGPT-Bench. Extensive experiments demonstrate that SD-VLM outperforms GPT-4o and Intern-VL3-78B by 26.91% and 25.56% respectively on MSMU-Bench. Code and models are released at https://github.com/cpystan/SD-VLM.
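The abstract does not specify how the depth positional encoding is computed, but one common way to inject per-patch depth into a transformer is a sinusoidal encoding of metric depth added to the patch embeddings. The sketch below is a hypothetical illustration of that general idea, not SD-VLM's actual formulation; the function name, dimensions, and fusion-by-addition are all assumptions:

```python
import numpy as np

def depth_positional_encoding(depths, dim=64):
    """Sinusoidal encoding of per-patch depth values (in meters),
    mapping an (N,) depth array to (N, dim) codes, analogous to
    standard sinusoidal position encodings over token index."""
    depths = np.asarray(depths, dtype=np.float64)[:, None]    # (N, 1)
    freqs = 1.0 / (10000.0 ** (np.arange(0, dim, 2) / dim))   # (dim/2,)
    angles = depths * freqs                                   # (N, dim/2)
    enc = np.empty((depths.shape[0], dim))
    enc[:, 0::2] = np.sin(angles)  # even dims: sine of depth * freq
    enc[:, 1::2] = np.cos(angles)  # odd dims: cosine of depth * freq
    return enc

# Fuse with visual patch embeddings by simple addition (one plausible
# choice; concatenation or a learned projection would also work).
patches = np.random.randn(16, 64)          # 16 patch embeddings
depths = np.random.uniform(0.5, 10.0, 16)  # per-patch depth in meters
fused = patches + depth_positional_encoding(depths)
```

The appeal of such an encoding is that it gives the model a smooth, scale-aware signal of absolute depth per patch without changing the backbone architecture.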